Run a basic command, create an object, plot the object
With R, you can do so many things! For example, you can perform simple arithmetic operations, run complex statistical analyses, or create beautiful reports and even websites.
## 1.1 Write a simple command ----
2+2 # Run this command and look at pane 2
## [1] 4
Assign a name to your command.
This action will become very useful. You assign names to objects so that you can later perform operations with them, such as plotting.
## 1.2 Create an object using the assign symbol (<-)
a <- 2+2 # Run this command and look at pane 3
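As a small illustration of why naming helps, the sketch below reuses the object `a` in further calculations instead of retyping `2+2`. The names `double_a` and `a_squared` are made up for this example.

```r
a <- 2 + 2            # the same object as in section 1.2

# Reuse the stored value instead of retyping the calculation
double_a  <- a * 2    # hypothetical name, for illustration only
a_squared <- a^2      # hypothetical name, for illustration only

double_a              # run the name on its own to print the stored value
a_squared
```

Both new objects should also appear in pane 3 once you run these lines.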
Plot your variable ‘a’.
## 1.3 Let's plot our object
plot(a) # This appears on pane 4
R has readily available data sets for you to explore, play, analyse, plot…
Run this command if you want a list of what’s available:
data()
There are a few that are commonly used in R-Training sessions. You will encounter these frequently:
Iris data set
This data set holds measurements of 150 iris flowers: 50 from each of 3 species.
If you want to know more about this data set, run:
?iris # Look at pane 4
Note how we used the ? symbol to get help. You can also get help by running:
help("iris")
Let’s open the “iris” data set. With built-in data, you just need to type the name of the data set. Here, head shows only the first six rows.
head(iris) # Look at pane 3
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
To view the data set, use the command View. Bear in mind that R is case-sensitive. If you type view instead of View, you’ll get an error.
View(iris)
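Case sensitivity applies to every name in R, not only to View. The sketch below shows this with a made-up object, `my_value`. It uses `exists()` and `tryCatch()`, two base R functions this session has not introduced, only to display the result without stopping the script.

```r
my_value <- 10        # hypothetical object name

exists("my_value")    # TRUE: this exact name was created
exists("My_Value")    # FALSE: different capitalisation means a different name

# Asking for a name that does not exist raises an error. tryCatch() captures
# the error message here so that the rest of the script keeps running.
msg <- tryCatch(My_Value, error = function(e) conditionMessage(e))
msg
```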
Get a summary of the variables using the command summary. This command gives you an overview of the variables in your data set.
summary(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
## 1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
## Median :5.800 Median :3.000 Median :4.350 Median :1.300
## Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
## 3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
## Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
## Species
## setosa :50
## versicolor:50
## virginica :50
##
##
##
Let’s plot one of the variables. This data set has 5 variables: the length and width of the sepals, the length and width of the petals and the species. If you use the $ symbol, you are telling R to look for a specific variable within your data set.
plot(iris$Sepal.Length)
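The `$` symbol works with other functions too, not only with plot. Here is a short sketch that uses only the built-in `iris` data. `names()`, `length()`, `mean()` and `table()` are base R functions that this session has not covered, and the object name `sepal_length` is made up for this example.

```r
names(iris)                          # the variable names you can put after $

sepal_length <- iris$Sepal.Length    # pull one variable out and give it a name
length(sepal_length)                 # one value per flower
mean(sepal_length)                   # average sepal length

table(iris$Species)                  # how many flowers of each species
```

The mean and the species counts should match what summary(iris) reported.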
Now:
When you work with R, you use data sets that are saved somewhere on your computer.
R doesn’t know where they are.
We must tell R in what folder we are keeping our files. So, we set up the working directory.
Now… where did you save the material for this course?
setwd("/Users/roldix/Library/Mobile Documents/com~apple~CloudDocs/GitHub_Repos/r-training") # This is where my files are. Change the path accordingly and run this command.
Always set up a working directory when starting a new script.
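To check that the working directory is what you expect, a sketch like the following may help. It assumes the course layout in which worms.csv sits in a folder called data inside the working directory. If your files are organised differently, `file.exists()` simply returns FALSE.

```r
getwd()                          # prints the folder R is currently using

# TRUE only if a "data" folder containing worms.csv sits inside
# the working directory (an assumption about how the files are organised)
file.exists("data/worms.csv")

list.files()                     # lists what R can see in the working directory
```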
Open a data set that’s not built into R, and give it a name so that you can call it later on.
df <- read.csv("data/worms.csv")
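If you do not have the course files at hand, the sketch below shows the same idea with a tiny made-up file. The file contents, the column names `field` and `worms`, and the object names `tmp_file` and `toy` are invented for illustration. They are not the contents of worms.csv.

```r
# Write a tiny, made-up CSV to a temporary file
tmp_file <- tempfile(fileext = ".csv")
writeLines(c("field,worms",
             "A,4",
             "B,2",
             "C,7"), tmp_file)

# Read it back and give the result a name, as you would with a real file
toy <- read.csv(tmp_file)

head(toy)     # first rows
dim(toy)      # number of rows, then number of columns
toy$worms     # one variable, fetched with $
```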
Try completing the following tasks using the commands you learned today.
View the data
Get a summary of the data
How many variables and observations are there in your data?
Plot a variable
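If you get stuck on the third task, here is a hint that uses the built-in `iris` data rather than your own data. `dim()`, `nrow()` and `ncol()` are base R functions. Swap `iris` for the name you gave your data set.

```r
dim(iris)     # observations (rows) first, then variables (columns)
nrow(iris)    # number of observations only
ncol(iris)    # number of variables only
```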
You have completed your First Steps in R and RStudio training and you are ready for Session 2: Data Types and Structures!
Always keep in mind these Good Practices.
Commenting (#). A script is a conversation with your future self.
Outline. Use the outline feature to create documents with a structure.
Set up a working directory. Use a command so that when you reopen your script, you know where everything is.
The devil is in the details. The most likely explanation for an error message is that you missed something small, like a comma. Check for these missing details before stressing out.
Google it. If you want to do something complicated, chances are somebody else has tried it before. Search online for solutions to your problem. If you find none, ask on Stack Overflow.
Save your workspace… it will be handy later.
One script per job. When you become more proficient, your scripts will grow considerably. Try to create separate scripts, especially for the data cleaning process.
Create pseudocode. Start your script by setting up the titles of your sections. Then progressively populate the sections with subtitles and, lastly, fill in your code with commands. I would normally add the sections: Set up, Data, Data Cleaning, Data analysis, Data plotting, and Wrap up.
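As one possible starting point, the skeleton below turns the section titles suggested above into comments. It is only a sketch: the four trailing dashes follow the style used in section 1.1 so that the sections show up in the outline, and the bracketed header text is a placeholder for you to replace.

```r
# Title of the script: [your title]
# Author: [your name]
# Date: [today's date]

# 1. Set up ----

# 2. Data ----

# 3. Data Cleaning ----

# 4. Data analysis ----

# 5. Data plotting ----

# 6. Wrap up ----
```

The skeleton contains comments only, so running it does nothing until you add commands under each section.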
R is a whole universe of free resources, filled with wonderful people willing to help. Here is a list of some links you will find handy:
R Tutorials
R Books
Stack Overflow. On this platform you can post questions about your code and people will answer them.
Cheatsheets. You will find these cheatsheets useful further down the line.
In R Training Session 2 - Datatypes & Structures, you will use your newly acquired skills to take it to the next level and learn about the different data types and structures. By the end of session 2, you will be able to:
In this session you have:
These are all the lines of code that you used with additional examples to explore R in more depth:
# Title of the script:
# Author:
# Date:
# set a working directory
#setwd("copy here the path to the folder where you will keep your code and other files associated")
# 1. Simple commands ----
2+2
## [1] 4
## Other arithmetic operations
3*2
## [1] 6
1686/5
## [1] 337.2
1:10 # numbers from 1 to 10
## [1] 1 2 3 4 5 6 7 8 9 10
# 2. Create an object using the assign symbol (<-) to give it a name
a<-2+2
anyname <- 3*2 # spaces do not matter
1686/5 -> b # you can also invert the order
words <- "I can be text"
number_sequence <- 1:10
# And you can also do operations with your objects
a + anyname
## [1] 10
# or concatenate them using the function c(). This is important, you will use it all the time!
together <- c(words, a, b)
together # Run the name of the object and take a look. Note that the numbers are now text, because they were combined with words.
## [1] "I can be text" "4" "337.2"
# 3. Plot your object
plot(a)
# 4. Using built-in data
data() # list of all the data sets available
?iris # fetch information about anything using ? or ?? or help
head(iris) # get a glimpse of your data by looking at the first 6 observations in your data
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
tail(iris) # you can also look at the last 6 observations
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 145 6.7 3.3 5.7 2.5 virginica
## 146 6.7 3.0 5.2 2.3 virginica
## 147 6.3 2.5 5.0 1.9 virginica
## 148 6.5 3.0 5.2 2.0 virginica
## 149 6.2 3.4 5.4 2.3 virginica
## 150 5.9 3.0 5.1 1.8 virginica
View(iris) # use this command to look at the whole data set. Bear in mind that it's case sensitive.
summary(iris) # get summary information about your variables.
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
## 1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
## Median :5.800 Median :3.000 Median :4.350 Median :1.300
## Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
## 3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
## Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
## Species
## setosa :50
## versicolor:50
## virginica :50
##
##
##
plot(iris$Sepal.Length) # plot a specific variable from your data.
# The $ symbol fetches a variable within your data set.
summary(iris$Sepal.Width) # you can use the $ with other commands in R.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.800 3.000 3.057 3.300 4.400
# 5. Using a data file from outside R
df <- read.csv("data/worms.csv")
View(df)
summary(df)
## Field.Name Area Slope Vegetation
## Length:20 Min. :0.800 Min. : 0.00 Length:20
## Class :character 1st Qu.:2.175 1st Qu.: 0.75 Class :character
## Mode :character Median :3.000 Median : 2.00 Mode :character
## Mean :2.990 Mean : 3.50
## 3rd Qu.:3.725 3rd Qu.: 5.25
## Max. :5.100 Max. :11.00
## Soil.pH Damp Worm.density
## Min. :3.500 Mode :logical Min. :0.00
## 1st Qu.:4.100 FALSE:14 1st Qu.:2.00
## Median :4.600 TRUE :6 Median :4.00
## Mean :4.555 Mean :4.35
## 3rd Qu.:5.000 3rd Qu.:6.25
## Max. :5.700 Max. :9.00
dim(df)
## [1] 20 7
plot(df$Worm.density)
# 6. Save your work
# Save the changes to your script
# You can also save the workspace with all the objects you created and the data you used: pretty much everything in the Environment section in pane 3.
save.image("give_me_a_name.RData") # this creates an .RData file that keeps the objects in your environment (it does not save the script itself)
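To see what saving and reloading objects looks like, here is a minimal sketch. It uses `save()` and `load()`, close relatives of `save.image()` that work on the objects you name, and it writes to a temporary file so nothing is left behind in your working directory. The name `tmp_rdata` is made up for this example.

```r
a <- 2 + 2
words <- "I can be text"

# Save two named objects to a temporary file
# (save.image() would save everything in the environment instead)
tmp_rdata <- tempfile(fileext = ".RData")
save(a, words, file = tmp_rdata)

rm(a, words)        # remove both objects from the environment

load(tmp_rdata)     # bring them back from the file
a
words
```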
Thank you!
Any feedback or comments
Licensed under CC-BY 4.0